Memory and performance issues in parallel multifrontal factorizations and triangular solutions with sparse right-hand sides
Abstract
We consider the solution of very large sparse systems of linear equations on parallel architectures. In this context, memory is often a bottleneck that prevents or limits the use of direct solvers, especially those based on the multifrontal method. This work focuses on the memory and performance issues of the two memory- and computation-intensive phases of direct methods, namely the numerical factorization and the solution phase. The first part addresses the solution phase with sparse right-hand sides; the second part addresses the memory scalability of the multifrontal factorization.

In the first part, we focus on the triangular solution phase with multiple sparse right-hand sides, which appears in numerous applications. We place particular emphasis on the computation of entries of the inverse, where both the right-hand sides and the solution are sparse. We first present several storage schemes that enable a significant compression of the solution space, both in a sequential and in a parallel context. We then show that the way the right-hand sides are partitioned into blocks strongly influences performance, and we consider two different settings: the out-of-core case, where the aim is to reduce the number of accesses to the factors, which are stored on disk, and the in-core case, where the aim is to reduce the computational cost. Finally, we show how to enhance the parallel efficiency.

In the second part, we consider the parallel multifrontal factorization. We show that controlling the active memory specific to the multifrontal method is critical, and that commonly used mapping techniques usually fail to do so: they cannot achieve a high memory scalability, i.e., they dramatically increase the amount of memory needed by the factorization when the number of processors increases. We propose a class of "memory-aware" mapping and scheduling algorithms that aim at maximizing performance while enforcing a user-given memory constraint, and that provide robust memory estimates before the factorization. These techniques have raised performance issues in the parallel dense kernels used at each step of the factorization, for which we propose algorithmic improvements.

The ideas presented throughout this study have been implemented within the MUMPS (MUltifrontal Massively Parallel Solver) solver and experimented with on large matrices (up to a few tens of millions of unknowns) and massively parallel architectures (up to a few thousand cores). They have been shown to improve the performance and the robustness of the code, and will be available in a future release. Some of the ideas presented in the first part have also been implemented within the PDSLin (Parallel Domain decomposition Schur complement based Linear solver) package.
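The first part hinges on exploiting the sparsity of the right-hand sides: when the right-hand side is a column e_j of the identity, the forward solve L y = e_j only touches the part of the factor that is reachable from j, which is what makes the computation of selected entries of A^{-1} = U^{-1} L^{-1} tractable. The following is a minimal sketch of that reachability-based triangular solve in plain Python/SciPy; it is illustrative only (the routine name, the (indices, values) right-hand-side format, and the toy matrix are assumptions, not MUMPS code or an existing API).

```python
# Illustrative sketch: sparse lower-triangular solve L x = b that exploits
# the sparsity of b.  Only the part of L reachable from struct(b) is visited.
import numpy as np
import scipy.sparse as sp


def sparse_lower_solve(L_csc, b_idx, b_val):
    """Solve L x = b for sparse b given as (indices, values).

    L_csc must be lower triangular (CSC format) with nonzero diagonal.
    Returns only the entries of x that can be structurally nonzero.
    """
    indptr, indices, data = L_csc.indptr, L_csc.indices, L_csc.data

    # Symbolic step: struct(x) is the set of nodes reachable from struct(b)
    # in the graph of L (edge j -> i whenever L[i, j] != 0 with i > j).
    reach, stack = set(), list(b_idx)
    while stack:
        j = stack.pop()
        if j in reach:
            continue
        reach.add(j)
        for p in range(indptr[j], indptr[j + 1]):
            i = indices[p]
            if i > j and i not in reach:
                stack.append(i)

    # Numeric step: forward substitution restricted to that pattern.
    # For a lower triangular matrix, increasing index order is a valid
    # topological order, so sorting the (small) reach set is enough here.
    x = dict(zip(map(int, b_idx), map(float, b_val)))
    for j in sorted(reach):
        xj = x.get(j, 0.0)
        for p in range(indptr[j], indptr[j + 1]):
            if indices[p] == j:          # divide by the diagonal entry
                xj /= data[p]
        x[j] = xj
        for p in range(indptr[j], indptr[j + 1]):
            i = indices[p]
            if i > j:                    # scatter the update below the diagonal
                x[i] = x.get(i, 0.0) - data[p] * xj
    return {int(k): v for k, v in x.items()}


# Tiny example: solve L x = e_1; only rows {1, 3} of the factor are visited.
L = sp.csc_matrix(np.array([[2., 0., 0., 0.],
                            [1., 2., 0., 0.],
                            [0., 0., 2., 0.],
                            [0., 1., 1., 2.]]))
print(sparse_lower_solve(L, b_idx=[1], b_val=[1.0]))   # {1: 0.5, 3: -0.25}
```

A backward solve with U, restricted to the requested entries, completes the computation of an individual entry of the inverse; how the columns e_j are then grouped into blocks is precisely the partitioning question (out-of-core versus in-core) discussed in the abstract.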
Similar resources
Parallel Multifrontal Solution of Sparse Linear Least Squares Problems on Distributed-Memory Multiprocessors
We describe the issues involved in the design and implementation of efficient parallel algorithms for solving sparse linear least squares problems on distributed-memory multiprocessors. We consider both the QR factorization method due to Golub and the method of corrected semi-normal equations due to Björck. The major tasks involved are sparse QR factorization, sparse triangular solution and spa...
Efficient Parallel Solutions of Large Sparse SPD Systems on Distributed-Memory Multiprocessors
We consider several issues involved in the solution of sparse symmetric positive definite systems by the multifrontal method on distributed-memory multiprocessors. First, we present a new algorithm for computing the partial factorization of a frontal matrix on a subset of processors which significantly improves the performance of a previously designed distributed multifrontal algorithm. Second, new p...
On Evaluating Parallel Sparse Cholesky Factorizations
Though many parallel implementations of sparse Cholesky factorization, accompanied by experimental results, have been proposed, it seems hard to evaluate the performance of these factorization methods theoretically because of the irregular structure of sparse matrices. This paper is an attempt at such research. On the basis of the criteria of parallel computation and communication time, we ...
Robust Memory-Aware Mappings for Parallel Multifrontal Factorizations
We focus on memory scalability issues in multifrontal solvers like MUMPS. We illustrate why commonly used mapping strategies (e.g., a proportional mapping) cannot achieve a high memory efficiency. We propose a class of “memory-aware” algorithms that aim at maximizing performance under memory constraints. These algorithms provide both accurate memory predictions and a robust solver. We illustrat... (a sketch of this memory-aware mapping idea follows this list)
Modeling 1D Distributed-Memory Dense Kernels for an Asynchronous Multifrontal Sparse Solver
To solve sparse linear systems, multifrontal methods rely on dense partial LU decompositions of so-called frontal matrices; we consider a parallel, asynchronous setting in which several frontal matrices can be factored simultaneously. In this context, to address performance and scalability issues of acyclic pipelined asynchronous factorization kernels, we study models to revisit properties of le...
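Both the second part of the abstract and the "Robust Memory-Aware Mappings" entry above revolve around the same trade-off: proportional mapping splits the processes assigned to a subtree among its children according to workload, which favors performance but can make the per-process memory explode, whereas serializing sibling subtrees on all available processes lowers the per-process peak. The sketch below (plain Python, illustrative only; the Node fields, the crude per-process memory estimate, and the serialization fallback are simplifying assumptions, not the actual MUMPS memory-aware algorithm) shows the general shape of such a memory-constrained tree mapping.

```python
# Minimal sketch of a memory-constrained mapping of an elimination tree.
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    work: float        # estimated factorization cost of the front at this node
    mem: float         # estimated storage for the front / contribution block
    children: list = field(default_factory=list)


def subtree_work(node):
    return node.work + sum(subtree_work(c) for c in node.children)


def subtree_mem(node):
    # crude estimate of the memory footprint of the whole subtree
    return node.mem + sum(subtree_mem(c) for c in node.children)


def map_tree(node, procs, M0, mapping):
    """Assign `procs` processes to `node`, respecting a per-process budget M0."""
    mapping[node.name] = procs
    kids = node.children
    if not kids:
        return
    total = sum(subtree_work(c) for c in kids)
    # Proportional mapping: split the processes according to subtree workload.
    shares = [max(1, round(procs * subtree_work(c) / total)) for c in kids]
    fits = sum(shares) <= procs and all(
        subtree_mem(c) / s <= M0 for c, s in zip(kids, shares))
    if fits:
        for c, s in zip(kids, shares):
            map_tree(c, s, M0, mapping)
    else:
        # Memory-aware fallback: process the subtrees one after another, each
        # on all `procs` processes, which lowers the per-process memory peak.
        for c in kids:
            map_tree(c, procs, M0, mapping)


# Toy elimination tree: an unbalanced root with two subtrees.
tree = Node("root", 10.0, 4.0, [
    Node("left", 50.0, 8.0, [Node("l1", 5.0, 2.0), Node("l2", 5.0, 2.0)]),
    Node("right", 5.0, 1.0, [Node("r1", 1.0, 0.5)]),
])
mapping = {}
map_tree(tree, procs=8, M0=1.5, mapping=mapping)
# With M0 = 1.5 the root's subtrees are serialized (each keeps all 8 processes)
# instead of the 7/1 split that pure proportional mapping would choose.
print(mapping)
```

With a looser budget (e.g. M0 = 3.0) the same routine splits the root's subtrees 7/1 as proportional mapping dictates, which illustrates the performance/memory trade-off the abstract describes.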
Publication date: 2012